Modeling Hessian-vector products in nonlinear optimization: new Hessian-free methods
Authors
Abstract
In this paper, we suggest two ways of calculating interpolation models for unconstrained smooth nonlinear optimization when Hessian-vector products are available. The main idea is to interpolate the objective function using a quadratic on a set of points around the current one and, concurrently, to use curvature information from products of the Hessian times appropriate vectors, possibly defined by the interpolating points. These enriched conditions then form an affine space of model Hessians or model Newton directions, from which a particular one can be computed once an equilibrium or least secant principle is defined. A first approach consists of recovering a matrix satisfying the conditions, from which a direction is then computed. A second approach poses the recovery problem directly in the space of directions. Both techniques lead to a significant reduction in the overall number of Hessian-vector products compared with the inexact or truncated Newton method, although simple implementations may pay a cost in linear algebra or in the number of evaluations.
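As a rough illustration of the least secant idea, the sketch below recovers the minimum-norm symmetric matrix consistent with a few Hessian-vector pairs and then forms a Newton-like direction from it. The function name, the upper-triangular parameterization, and the use of a minimum-norm least-squares solve as a proxy for a least Frobenius-norm (equilibrium) selection are all illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def least_secant_hessian(vs, ps, n):
    # Build a linear system A x = b whose unknowns are the upper-
    # triangular entries of a symmetric model Hessian B and whose
    # equations encode the enriched conditions B @ v_j = p_j.
    iu = np.triu_indices(n)
    m = len(iu[0])
    rows, rhs = [], []
    for v, p in zip(vs, ps):
        for i in range(n):  # i-th component of B @ v equals p[i]
            row = np.zeros(m)
            for k, (a, b) in enumerate(zip(iu[0], iu[1])):
                if a == i:
                    row[k] += v[b]
                if b == i and a != b:
                    row[k] += v[a]
            rows.append(row)
            rhs.append(p[i])
    A, b = np.vstack(rows), np.array(rhs)
    # lstsq returns the minimum-norm solution of this underdetermined
    # system -- a simple stand-in for picking one element of the
    # affine space of matrices satisfying the conditions.
    x, *_ = np.linalg.lstsq(A, b, rcond=None)
    B = np.zeros((n, n))
    B[iu] = x
    return B + B.T - np.diag(np.diag(B))

# Hypothetical usage: two Hessian-vector pairs in R^4, then a
# Newton-like direction from the recovered model Hessian.
rng = np.random.default_rng(0)
H = rng.standard_normal((4, 4)); H = H + H.T      # "true" Hessian
vs = [rng.standard_normal(4) for _ in range(2)]
ps = [H @ v for v in vs]                          # products H @ v
g = rng.standard_normal(4)                        # current gradient
B = least_secant_hessian(vs, ps, 4)
d = -np.linalg.pinv(B) @ g    # pinv guards against a singular model
```

With only a few pairs the model Hessian is underdetermined, which is exactly why a selection principle such as least secant is needed; the pseudo-inverse above is just a safeguard for this toy dimension.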
منابع مشابه
Saddle-free Hessian-free Optimization
Nonconvex optimization problems, such as those arising in the training of deep neural networks, suffer from a phenomenon called saddle point proliferation. This means that a vast number of high-error saddle points are present in the loss function. Second-order methods have been tremendously successful and widely adopted in the convex optimization community, while their usefulness in deep learning remai...
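For context, a minimal dense-matrix sketch of the saddle-free Newton idea (Dauphin et al., 2014) is shown below: the step rescales the gradient by the inverse of |H|, the Hessian with its eigenvalues replaced by their absolute values, so negative-curvature directions at a saddle repel rather than attract the iterate. The function name and damping constant are assumptions for illustration; practical large-scale versions approximate |H| in a Krylov or low-rank subspace instead of forming it.

```python
import numpy as np

def saddle_free_step(hess, grad, damping=1e-3):
    # Symmetric eigendecomposition of the Hessian.
    w, V = np.linalg.eigh(hess)
    # Take absolute eigenvalues; damping guards near-zero curvature.
    w_abs = np.abs(w) + damping
    # Step d = -|H|^{-1} g, computed in the eigenbasis.
    return -V @ ((V.T @ grad) / w_abs)
```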
Preconditioning for Hessian-Free Optimization
Recently, Martens adapted the Hessian-free optimization method for the training of deep neural networks. One key aspect of this approach is that the Hessian is never computed explicitly; instead, the conjugate gradient (CG) algorithm is used to compute the new search direction by applying only matrix-vector products of the Hessian with arbitrary vectors. This can be done efficiently using a varian...
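The sketch below illustrates that pattern, assuming a callable hvp that returns Hessian-vector products (e.g., from automatic differentiation or finite differences); the names cg_direction and hvp are illustrative. It is the plain unpreconditioned CG loop, so the preconditioner the abstract alludes to would enter by applying M^{-1} to the residual before the search-direction update.

```python
import numpy as np

def cg_direction(hvp, grad, max_iter=100, tol=1e-8):
    # Solve H d = -grad by conjugate gradients, accessing the
    # Hessian only through the callable hvp(v), which returns H @ v.
    d = np.zeros_like(grad)
    r = -grad            # residual of H d = -grad at d = 0
    p = r.copy()
    rs = r @ r
    for _ in range(max_iter):
        Hp = hvp(p)
        alpha = rs / (p @ Hp)
        d = d + alpha * p
        r = r - alpha * Hp
        rs_new = r @ r
        if np.sqrt(rs_new) < tol:
            break
        p = r + (rs_new / rs) * p
        rs = rs_new
    return d
```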
Block-diagonal Hessian-free Optimization
Second-order methods for neural network optimization have several advantages over methods based on first-order gradient descent, including better scaling to large mini-batch sizes and fewer updates needed for convergence. But they are rarely applied to deep learning in practice because of high computational cost and the need for model-dependent algorithmic variations. We introduce a variant of ...
Hessian Free Optimization Methods for Machine Learning Problems
In this article, we describe the algorithm and study the performance of a Hessian-free optimization technique applied to machine learning problems. We implement the commonly used black-box model for optimization and solve a particularly challenging recursive neural network learning problem, which exhibits a non-convex and non-differentiable function output. In order to adapt the method to machine ...
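A black-box interface of this kind only needs gradient evaluations, which is enough for the classic Hessian-free trick: approximating the Hessian-vector product by a forward difference of gradients. The sketch below is a minimal version of that idea; hvp_fd, grad_fn, and the step-size heuristic are illustrative assumptions, and exact products (e.g., Pearlmutter's R-operator in autodiff frameworks) are the usual higher-accuracy alternative.

```python
import numpy as np

def hvp_fd(grad_fn, x, v, eps=None):
    # Approximate H(x) @ v by a forward difference of gradients:
    #   H v ~ (grad f(x + eps*v) - grad f(x)) / eps,
    # so the Hessian is never formed or stored.
    if eps is None:
        # Heuristic scaling of eps relative to ||x|| and ||v||.
        eps = 1e-6 * (1.0 + np.linalg.norm(x)) / (np.linalg.norm(v) + 1e-12)
    return (grad_fn(x + eps * v) - grad_fn(x)) / eps

# Hypothetical check on a smooth test function (2-D Rosenbrock).
def rosen_grad(z):
    x, y = z
    return np.array([-2*(1 - x) - 400*x*(y - x**2), 200*(y - x**2)])

x0 = np.array([1.2, 1.0])
v = np.array([0.3, -0.7])
print(hvp_fd(rosen_grad, x0, v))   # approximates Hessian(x0) @ v
```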
Deep learning via Hessian-free optimization
We develop a 2nd-order optimization method based on the “Hessian-free” approach, and apply it to training deep auto-encoders. Without using pre-training, we obtain results superior to those reported by Hinton & Salakhutdinov (2006) on the same tasks they considered. Our method is practical, easy to use, scales nicely to very large datasets, and isn’t limited in applicability to autoencoders, or...
Journal
Journal title: IMA Journal of Numerical Analysis
Year: 2021
ISSN: 1464-3642, 0272-4979
DOI: https://doi.org/10.1093/imanum/drab022